Abstract

Background: Sm-like proteins are highly conserved proteins that form the core of the U6 ribonucleoprotein and
function in several mRNA metabolism processes, including pre-mRNA splicing. Despite their wide occurrence in all 
eukaryotes, little is known about the roles of Sm-like proteins in the regulation of splicing. 

Results: Here, through comprehensive transcriptome analyses, we demonstrate that depletion of the Arabidopsis
supersensitive to abscisic acid and drought 1 gene (SAD1), which encodes Sm-like protein 5 (LSm5), promotes an
inaccurate selection of splice sites that leads to a genome-wide increase in alternative splicing. In contrast,
overexpression of SAD1 strengthens the precision of splice-site recognition and globally inhibits alternative splicing.
Further, SAD1 modulates the splicing of stress-responsive genes, particularly under salt-stress conditions. Finally, we find
that overexpression of SAD1 in Arabidopsis improves salt tolerance in transgenic plants, which correlates with an
increase in splicing accuracy and efficiency for stress-responsive genes.

Conclusions: We conclude that SAD1 dynamically controls splicing efficiency and splice-site recognition in Arabidopsis,
and propose that this may contribute to SAD1-mediated stress tolerance through the metabolism of transcripts expressed
from stress-responsive genes. Our study not only provides novel insights into the function of Sm-like proteins in splicing, 
but also uncovers new means to improve splicing efficiency and to enhance stress tolerance in a higher eukaryote. 



Background 

Immediately following transcription, many eukaryotic pre- 
cursor messenger RNAs (pre-mRNA) are subjected to a 
series of modifications that are essential for the matur- 
ation, nuclear export and subsequent translation of these 
transcripts. One such modification, the removal (splicing) 
of non-protein-coding sequences from pre-mRNA, is an 
important step in gene regulation that also contributes to 
increased protein diversity from a limited number of genes. 
The precision and efficiency of splicing are critical for gene 
function [1]. An imprecise splicing process would
generate aberrant or non-functional mRNAs that are
not only wasteful but can also lead to the production of
unwanted or harmful proteins that may perturb normal
cellular processes. Moreover, incorrectly spliced transcripts
might also have a profound impact on other processes,
including mRNA transcription, turnover, transport and 
translation. Accumulating evidence indicates that poor 
efficiency or defects in splicing can lead to diseases in 
humans [2,3] and increase sensitivity to abiotic or biotic 
stresses in plants [4-6]. Although many molecular pro- 
cesses related to splicing have been well characterized, 
we still face a major challenge in understanding how 
precision and efficiency in splicing are regulated and how 
we could harness these regulations to enhance cellular 
functions. 

Sm-like proteins (LSms) are a highly conserved family
of proteins in eukaryotes both in terms of sequence and
function. LSms typically exist as heptameric complexes
and play roles in multiple aspects of RNA metabolism [7-9].
The heptameric LSm1-7 cytoplasmic complex is located
in discrete cytoplasmic structures called P-bodies, which
are conserved in all eukaryotes and are thought to be
involved in decapping and 5' to 3' RNA degradation
[10,11]. The LSm2-8 heptameric complex is located in
the nucleus. This complex directly binds and stabilizes
the 3'-terminal poly(U) tract of U6 small nuclear RNA,
forms the core of the U6 small nuclear ribonucleoproteins
(RNPs) and functions in pre-mRNA splicing [12,13]. The
Arabidopsis supersensitive to abscisic acid (ABA) and
drought 1 (SAD1) gene locus encodes the LSm5 protein
and was identified in a genetic screen for components
that regulate the expression of stress-responsive genes
in our previous work [14]. SAD1 directly interacts with
two other subunits, LSm6 and LSm7, and is a component
of the LSm2-8 nuclear complex [15]. Dysfunction
of SAD1 increases the plant's sensitivity to salt stress
and to the stress hormone ABA in seed germination and
root growth; moreover, sad1 mutants are defective in the
positive feedback regulation of ABA biosynthesis genes by
ABA and are impaired in drought stress induction of
ABA biosynthesis, although the detailed molecular bases
for these defects have not been identified. Recent studies
suggested that the depletion of SAD1 and the other
LSm protein (LSm8) reduced the stability of U6 RNPs
and resulted in defects in pre-mRNA splicing that lead
to intron retention in Arabidopsis [15,16]. However, it is
still unclear whether the depletion of SAD1 or other LSm
proteins has any effect on the selection of splice sites and
alternative splicing (AS), and whether overexpression
of these LSm proteins could affect splicing efficiency
or accuracy.

To investigate possible regulatory roles of the SAD1 protein
in pre-mRNA splicing, we performed RNA sequencing
(RNA-seq) of the wild-type Arabidopsis (C24 ecotype),
the sad1 mutant and the SAD1-overexpressing plants
(SAD1-OE). We found that SAD1 could dynamically control
splicing efficiency and splice-site recognition and
selection in Arabidopsis. Additionally, we discovered that
SAD1 is required for regulation of splicing efficiency of
many stress-responsive genes under stress conditions.
Whereas there are increased splicing defects in sad1
mutants under salt-stress conditions, overexpression of
SAD1 increases the splicing efficiency of stress-related
genes. SAD1-OE plants are also more salt-tolerant than
wild-type plants. Our work not only provides novel insights
into the regulatory role of SAD1 and LSm proteins
in splicing, but also suggests a new way to improve splicing
efficiency and to optimize cellular functions and
generate stress-resistant plants.

Results 

RNA sequencing of wild-type, sad1 mutant and
SAD1-overexpressing plants

The Arabidopsis sad1 mutant was isolated in our previous
genetic screen for components that regulate the expression
of stress-responsive genes [14]. The sad1 mutant
was also more sensitive to stress and ABA inhibition of
seed germination and seedling growth [14]. Since loss-of-function
mutations in any single-copy core LSm
gene are expected to be lethal, the recovery of this
point-mutation sad1 mutant provided an invaluable
opportunity to study the functions of this important
group of proteins. To explore the role of SAD1 in gene
expression and stress tolerance, we generated transgenic
Arabidopsis plants overexpressing the wild-type SAD1
cDNA (under control of the cauliflower mosaic virus
35S promoter) both in the wild type (ecotype C24) and
in the sad1 mutant background (SAD1-OE, see Methods).
Although the transgenic plants in both backgrounds
had similar physiological and molecular phenotypes, here
we mainly focus on SAD1-OE in the sad1 mutant background
(referred to as SAD1-OE hereafter).

As shown in Figure 1A, overexpressing wild-type SAD1
rescued the small stature phenotype of the sad1 mutant,
demonstrating that the phenotypic defects of the sad1
mutant were caused by the loss of the wild-type SAD1
protein. We genotyped these seedlings using primers
that span the whole gene body. The PCR products in
the SAD1-OE plant had two bands, representing the
original SAD1 gene and the transferred cDNA, respectively
(Figure 1B).

We next performed RNA-seq using the Illumina HiSeq
platform (Illumina Inc., San Diego, CA, USA) on two-week-old
seedlings of C24 (wild type), sad1 and SAD1-OE.
These seedlings were subjected to two treatments: control
(H2O) and salt stress (300 mM NaCl, 3 h). The salt-stress
treatment was based on our previous observations
that stress-responsive genes were most obviously activated
and that the mutant sad1 showed strong molecular
phenotypes under these conditions [14,17]. Based on
six cDNA libraries (C24-control, sad1-control, SAD1-OE-control,
C24-NaCl, sad1-NaCl and SAD1-OE-NaCl),
we generated a total of 164 million reads (101 bp in
length, except for the SAD1-OE control, whose reads
were 85 bp in length), about 90% of which could be
uniquely aligned to the TAIR10 reference genome sequence
[18] (Additional file 1). Comparison
of the mapped reads to the gene model (version
TAIR10) revealed that approximately 95% of the reads
mapped to the exonic regions, whereas only about 3%
mapped to intergenic regions (Additional file 2), which
is consistent with the Arabidopsis gene annotation.
Plotting the coverage of reads along each transcript
exhibited a uniform distribution with no obvious 3'/5'
bias, which reflects the high quality of the cDNA libraries
(Additional file 3). Furthermore, assessing the sequencing
saturation demonstrated that as more reads were
obtained, the number of newly discovered genes plateaued
(Additional file 4). This suggests that extensive
coverage was achieved, which can also be seen
when the read coverage was plotted by chromosome,
Figure 1 Generation of the SAD1-overexpressing transgenic plants (SAD1-OE) and the splicing variants of SAD1 in the wild type, sad1
and SAD1-OE. (A) Morphology of the wild type, sad1 and SAD1-OE seedlings in soil. (B) Genotype analysis of plants shown in (A). The upper
and lower bands of PCR products represent the endogenous SAD1 gene and the transgenic cDNA, respectively. (C) RNA-seq reads were visualized
by the Integrative Genomics Viewer (IGV) browser across the SAD1 gene. Exon-intron structure is given at the bottom of each panel. The arcs
generated by the IGV browser indicate splice junction reads that support the splice junctions. The grey peaks indicate RNA-seq read density across
the gene. The upper panel depicts the mutation of sad1 that changed the wild-type invariant dinucleotide AG to AA at the 3' splicing acceptor
recognition site of the first intron. The middle panel shows transcripts with two aberrant 3' splice sites (3'SSs) that respectively occurred
20 bp (enlarged and marked by 3) and 2 bp (enlarged and marked by 2) downstream of the mutated splice site and transcripts with the retention
of the first intron (marked by 1) in sad1. Also shown are SAD1 transcripts in the wild type where they were normally spliced. (D) Three variants of
SAD1 transcripts discovered in sad1 by RNA-seq were validated by RT-PCR using junction-flanking primers. The three bands in the sad1 mutant
from top to bottom represent transcripts with the first intron retained, the first aberrant 3'SS and the second aberrant 3'SS, respectively. Note that in
the wild-type and SAD1-OE plants, only one wild-type SAD1 band was detected. (E) SAD1 expression levels are shown using the reads per kilobase
per million value and quantitative RT-PCR. bp, base pairs; RPKM, reads per kilobase per million; sad1, sad1 mutant; SAD1-OE, plants overexpressing
wild-type SAD1 in the sad1 mutant background; WT, wild type.



demonstrating extensive transcriptional activity in the 
genome (Additional file 5). 

We previously identified the sad1 mutation as a G-to-A
change 34 bp from the putative translation start site
and predicted that the mutation would change a glutamic
acid (E) residue to lysine (K). In the RNA-seq data, the
mutation of sad1 at the genomic position 19,813,556 of
chromosome 5 was confirmed. However, it turned out
that the mutation occurred at the 3' splicing acceptor
recognition site of the first intron, changing the invariant
AG dinucleotide to AA. Consequently, all sad1 mRNAs
were aberrantly spliced in the mutants, as visualized with
the Integrative Genomics Viewer (IGV) browser [19,20]
(Figure 1C). We identified three main mutant transcripts
in sad1: two with obvious aberrant 3' splice sites (3'SSs)
that respectively occurred 2 bp and 20 bp downstream
of the mutated splice site; and one with the retention of
the first intron (Figure 1C). All of these transcripts
were validated by RT-PCR using primers that span the
alternative 3'SSs, in which the corresponding events were
detected in the sad1 mutant, but not in C24 (Figure 1D).
Sequence analysis suggested that the transcript with the
aberrant 3'SS that occurred 20 bp downstream of the
mutated splice site did not alter the coding frame. It
was predicted to produce one novel protein with the
deletion of seven amino acids compared to the normal
SAD1 protein. It seems that this mutant protein could
provide some of the wild-type protein's functions such that the
sad1 mutation was not lethal. By contrast, the other alternative
3'SS and the intron retention led to a coding-frame
shift that would generate a premature stop codon and
thus would lead to truncated proteins. In the SAD1-OE
plant, all these aberrantly spliced forms could be found,
albeit at much lower levels than in sad1. However, normal
SAD1 mRNA was overexpressed, with the transcript
level more than 10 times higher than in C24, which was
validated by quantitative RT-PCR (Figure 1E).
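
The expression levels in Figure 1E are reported as reads per kilobase per million (RPKM). As a minimal sketch of that normalization, using the standard RPKM definition, the function below converts a raw read count into RPKM. The read counts, transcript length and library size are hypothetical values chosen only for illustration and are not taken from the data.

```python
def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


# Hypothetical counts for one gene in two libraries of equal size.
overexpressor = rpkm(1000, 2000, 20_000_000)
wild_type = rpkm(100, 2000, 20_000_000)
fold_change = overexpressor / wild_type
```

Because RPKM divides by both transcript length and library size, the fold change between two libraries reflects relative transcript abundance rather than sequencing depth.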



Identification of alternative splicing events in C24, sad1
and SAD1-OE plants

To determine if there were any changes in pre-mRNA
splicing upon the depletion or overexpression of SAD1,
we first developed a pipeline to identify all AS events in
C24, sad1 and SAD1-OE. The pipeline involved three
steps: prediction of splice junctions, filtering of the false
positive junctions and annotation of AS events. We
randomly sampled 20 million uniquely mapped reads
(estimated average approximately 57-times coverage on
all the expressed transcripts) from each RNA-seq library
for the identification and comparison of AS.
This method ensured that the comparison of AS events
would be performed at the same sequencing depth.
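
The equal-depth sampling described above can be sketched as follows. This is an illustrative sketch only: the function name, the seed and the toy library sizes are assumptions, and the text does not state which tool was actually used for sampling.

```python
import random


def subsample_reads(read_ids, n, seed=0):
    """Draw n read IDs uniformly at random, without replacement."""
    if n > len(read_ids):
        raise ValueError("library has fewer reads than requested")
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    return rng.sample(read_ids, n)


# Toy libraries of unequal size (hypothetical read IDs).
libraries = {
    "C24-control": [f"c24_{i}" for i in range(1000)],
    "sad1-control": [f"sad1_{i}" for i in range(1500)],
}
# Every library is reduced to the same number of reads before AS calling.
equalized = {name: subsample_reads(ids, 500) for name, ids in libraries.items()}
```

Sampling without replacement keeps each read at most once, so event counts in different libraries are compared at the same depth.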

To predict splice junctions, we mapped the RNA-seq 
reads onto the Arabidopsis genome using the software 
TopHat, which was designed to identify exon-exon splice 
junctions [21]. After the alignment, we identified 732,808 
junctions from the six RNA-seq libraries. Comparison of 
these junctions to the gene annotation (TAIR10) revealed
that about 83% of total junctions had previously been 
annotated, and the remaining 17% were assigned as novel 
junctions (Additional file 6A). However, when trying to 
characterize these novel and annotated junctions, we 
found that there was a large number of novel junctions 
that had short overhangs (that is, fewer than 20 bp) with 
the corresponding exons, while most of the annotated 
junctions had large overhangs, with the enrichment at 
approximately 90 bp (Additional file 6B). Moreover, the 
novel junctions had relatively low coverage compared with 
the annotated junctions (Additional file 6C). In general, 
the junctions with short overhangs and lower coverage 
were considered as false positives, which are often caused 
by non-specific or erroneous alignment. Therefore, to distinguish
between true splice junctions and false positives,
we assessed the criteria based on simulated data of a set 
of randomly constituted junctions. To do this, we first
generated a set of 80,000 splice junctions in which annotated
exons from different chromosomes were randomly
selected and spliced together in silico. We also constructed 
119,618 annotated junctions from the gene annotation. 
Since the length of our sequencing reads was 101/85 bp, 
the splice junction sequences were determined to be 
180/148 bp long (90/74 nucleotides on either side of the 
splice junction) to ensure an 11 bp overhang of the read 
mapping from one side of the junction onto the other. 
Alignments to the random splice junctions were considered 
to be false positives, because such junctions are thought 
to rarely exist when compared to annotated junctions. 
The alignment of the raw RNA-seq reads to the random 
junctions revealed that 99.90% of false positive junctions 
had an overhang size of fewer than 20 bp (Additional 
file 7A). In sharp contrast, the alignment to the anno- 
tated junctions indicated that most (98.60%) annotated 
junctions had larger overhang sizes. In addition, we esti- 
mated that 56.90% of false positive junctions had only 
one read spanning the junction, whereas the annotated 
junctions had higher read coverage (Additional file 7B). 
To minimize the false positive rate, we required that 
the overhang size must be more than 20 bp and that 
there be at least two reads spanning the junctions. Using 
these criteria, we filtered out almost all the false positive 
junctions (Additional file 7C). Finally, we obtained a junc- 
tion data set of 52,599 confident novel junctions from the 
six RNA-seq libraries. Based on these junctions, we identi- 
fied all the AS events including cassette exons, alternative 
5'SSs, alternative 3'SSs, mutually exclusive exons, coord- 
inate cassette exons, alternative first exons and alternative 
last exons (Additional file 8). 
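
The two filtering criteria above, and the arithmetic behind the 180/148 bp junction sequences, can be sketched as follows. The `Junction` record, its field names and the example junctions are hypothetical, and taking the largest overhang among supporting reads as "the overhang size" is an assumption; only the thresholds and read lengths come from the text.

```python
from dataclasses import dataclass

MIN_OVERHANG = 20  # overhang must be more than 20 bp
MIN_READS = 2      # at least two reads must span the junction


@dataclass
class Junction:
    chrom: str
    start: int
    end: int
    max_overhang: int  # assumed: largest overhang among supporting reads
    n_reads: int       # number of reads spanning the junction


def junction_template_length(read_length, min_anchor=11):
    """Length of a junction sequence that forces >= min_anchor bp on each side."""
    return 2 * (read_length - min_anchor)


def is_confident(junction):
    """Apply both criteria: overhang above 20 bp and at least two reads."""
    return junction.max_overhang > MIN_OVERHANG and junction.n_reads >= MIN_READS


candidates = [
    Junction("Chr1", 1000, 1200, max_overhang=45, n_reads=12),
    Junction("Chr1", 5000, 5300, max_overhang=12, n_reads=7),  # short overhang
    Junction("Chr2", 800, 950, max_overhang=38, n_reads=1),    # single read
]
confident = [j for j in candidates if is_confident(j)]
```

With 101 bp and 85 bp reads, `junction_template_length` gives 180 and 148, matching the 90 and 74 nucleotides taken on either side of each junction.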

Depletion of SAD1 activates alternative splicing 

We first compared the difference in AS between C24
and the sad1 mutant. By comparing the number of AS
events, we found that the alternative 5'SSs and exon-skipping
events were consistently promoted in the control
and NaCl-treated mutants (Figure 2A; Additional
file 9A). Furthermore, the number of splice junction
reads from alternative 5'SSs and exon-skipping events in
the mutant was significantly higher than that in the wild
type (Fisher's exact test, P < 0.001) (Figure 2B; Additional
file 9B). Using Fisher's exact test on the junction read
counts and the corresponding exon read counts between
the wild type and the mutant, we identified 478 alternative
5'SSs and 138 exon-skipping events from 550 genes
that were significantly over-represented in the control or
NaCl-treated mutants; by contrast, we identified only
133 alternative 5'SSs and 41 exon-skipping events from
171 genes that were over-represented in the corresponding
wild type (Additional files 10, 11, 12 and 13). These
results indicated that SAD1 depletion increased alternative
5'SSs and exon-skipping events. In addition, the alternative
3'SSs showed significant increases in the NaCl-treated
mutant. We identified 319 alternative 3'SSs that were
over-represented in the mutant; by contrast, 142 were
over-represented in the wild type (Additional files 14, 15).
This result suggests that SAD1 depletion could also promote
alternative 3'SSs under salt-stress conditions.
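
The per-event comparison described above can be illustrated with a one-sided Fisher's exact test on a 2 x 2 table of junction reads and exon reads in the mutant and the wild type. This is a sketch under stated assumptions: the layout of the table and the choice of a one-sided alternative are assumptions, and the read counts are invented for illustration.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the table [[a, b], [c, d]].

    Returns the probability, with all margins fixed, of observing a
    top-left count of at least a (hypergeometric upper tail).
    """
    row1 = a + b
    col1 = a + c
    n = a + b + c + d
    upper = min(row1, col1)
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k)
               for k in range(a, upper + 1))
    return tail / comb(n, col1)


# Hypothetical counts for one alternative 5'SS:
# rows = (mutant, wild type), columns = (junction reads, exon reads).
p_value = fisher_exact_greater(30, 100, 2, 100)
over_represented_in_mutant = p_value < 0.001
```

Testing junction reads against exon reads from the same gene asks whether the alternative junction is used more often relative to the gene's overall expression, not whether the gene is simply more highly expressed.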

Twenty-two selected events were further validated by
RT-PCR using splice-site-flanking primers, in which
the corresponding AS events were detected in sad1 mutants,
but were weak or absent in C24 (Figure 2C
and Additional file 16). Figure 2C highlights three
representative examples visualized by the IGV junction
browser and validated by RT-PCR. The SBI1 (AT1G02100)
gene had alternative 5'SSs in the 10th intron in sad1, but
not in C24, an observation validated by RT-PCR using a
forward primer that covered the splice junction and a
reverse primer located in the 11th exon. The
corresponding isoform was detected in the
sad1 mutant, but was not present in C24 (Figure 2C).
The HINT3 (AT5G48545) gene had alternative 3'SSs in
the fifth exon in the sad1 mutant, which was validated
by RT-PCR using a forward primer in the first exon and
a reverse primer that covered the splice junction (Figure 2C).
The gene PAC (AT2G48120) exhibited exon-skipping
between the third and fifth exons, which was validated
by RT-PCR using primers in the third and sixth exons,
so that two different products were amplified,
representing the exon inclusion and skipping isoforms,
respectively (Figure 2C).

Sequence analysis of these over-represented alternative
5'SSs and alternative 3'SSs (in the NaCl-treated sad1
mutant) revealed that these activated splice sites were still
associated with GU and AG dinucleotides (Figure 2D;
Additional file 17A), suggesting that the depletion of
SAD1 did not change the accuracy of the sequence recognition
of the splicing sites. When investigating the
distribution of these activated splice sites, we found that
alternative 5'SSs and 3'SSs were enriched in the
approximately 10 bp region downstream or upstream of the
dominant 5'SSs and 3'SSs, respectively (Figure 2E;
Additional file 17B). This indicates that the depletion
of SAD1 leads to the activation of the 5'SSs and 3'SSs
proximal to the respective dominant ones. These results
suggest that SAD1, as a component of U6 RNPs, may play
a regulatory role in the selection of splice sites.

Interestingly, exon-skipping events also increased in sad1
mutants. When correlating each exon-skipping event with
alternative 5'SSs and 3'SSs, we found that about 20% of
the skipped exons simultaneously had alternative 5'SSs
or 3'SSs in the mutants. This chance of occurrence was
significantly higher than that expected for random sampling
of all annotated exons (the probability of random
occurrence was 0.02%, Fisher's exact test, P < 0.001). This
result suggests a coordinated occurrence of exon-skipping
and alternative splice site selection. Therefore, we considered
that SAD1 depletion could simultaneously activate
multiple alternative 5'SSs or 3'SSs that include not
only the proximal ones, but also the distal ones, including
those located at the next exons, albeit to a lesser
extent. Nonetheless, the possibility that SAD1, probably
as a splicing factor, may directly regulate exon-skipping
in vivo could not be ruled out.

SAD1 depletion results in widespread intron retention

Based on DNA chip and RT-PCR analyses, very recent
studies have suggested that the depletion of SAD1 and
other LSm proteins can result in defects in intron removal
[15,16]. Nonetheless, genome-wide analyses at the single
nucleotide level of splicing defects in these mutants are
not available. Based on our RNA-seq data, we plotted
the expression intensity of introns and exons between
the wild-type C24 and sad1 mutants (Figure 3; Additional
file 18). Figure 3 clearly shows a global up-regulation of
intron expression in the mutants, but this was not seen
for exon expression, suggesting widespread intron retention
in the mutant. Ten selected events were further validated
by RT-PCR using intron-flanking primers, in
which the corresponding intron retention events were
detected in sad1 mutants, but were weak or absent
in C24 (Additional file 19). Using Fisher's exact
test, we compared the counts of intron reads and the corresponding
counts of exon reads between the wild type
and mutants. We identified 4,610 introns from 2,737 genes
that were significantly retained in the control or NaCl-treated
mutants (P < 0.001) (Additional file 20). By contrast,
only 23 introns from 20 genes were significantly
retained in the corresponding wild-type plants (Additional
file 21). This result further demonstrated that SAD1
depletion results in widespread intron retention.



We next investigated whether the splicing defects
influence the expression of the affected genes.
Sequence analysis suggested that all of these intron retention
events would generate premature stop codons in
the intron-retained transcripts and, if translated, would
produce truncated proteins. Although it is possible that
some individual truncated proteins might still be functional,
for our sequence analyses, we assumed that these
intron-retained transcripts do not generate functional proteins.
By calculating the proportion of intron-retained
transcripts to total transcripts for each gene
with intron retention in the mutant, we estimated that on
average around 15% of total transcripts retained an intron
(Additional file 22). Moreover, when plotting the
expression levels of the total and the functional transcripts
(without the retained intron) for each intron-retained gene between
the wild-type C24 and sad1 mutants (Additional files 23
and 24), we found that the expression levels of the total
transcripts did not obviously change between C24 and
sad1, but the functional transcripts tended to be down-regulated
in the mutant. These results indicate that the
splicing defects are associated with a global reduction
of functional mRNAs, which might negatively affect the
functions of these affected genes.
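
The relationship between total and functional transcripts discussed above can be sketched as follows. The text does not state how the proportion of intron-retained transcripts was computed, so estimating it as the ratio of intron read density to exon read density is an assumption made here purely for illustration, as are the function names and the numbers.

```python
def retained_fraction(intron_reads, intron_len, exon_reads, exon_len):
    """Assumed estimator: intron read density divided by exon read density."""
    exon_density = exon_reads / exon_len
    if exon_density == 0:
        return 0.0
    intron_density = intron_reads / intron_len
    return min(1.0, intron_density / exon_density)


def functional_expression(total_expression, fraction_retained):
    """Expression left after removing intron-retained transcripts."""
    return total_expression * (1.0 - fraction_retained)


# Hypothetical gene: 30 reads on a 200 bp intron, 1,000 reads on 1,000 bp of exon.
fraction = retained_fraction(30, 200, 1000, 1000)
functional = functional_expression(100.0, fraction)
```

Under this estimator a gene whose total expression is unchanged can still lose functional mRNA, which is the pattern reported for the intron-retained genes in the mutant.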

Genes with aberrant splicing in sad1 are closely related to
stress response and are activated by stress

We further analyzed functional categories and pathways
of the genes with abnormal splicing in the sad1 mutants.
We identified 3,354 genes with abnormal splicing in control
or NaCl-treated sad1 mutants, the majority of which
showed intron retention. Moreover, 83% of these genes
were unique to either the control treatment or the NaCl
treatment, suggesting that abnormal splicing may be
specific to different treatments. An analysis of functional
categories using the software DAVID [22,23] revealed that
these abnormally spliced genes were significantly enriched
in several biological processes, including response to abiotic
stimulus, response to stress, photosynthesis, and protein
transport, suggesting that SAD1 is involved in
multiple biological processes through regulating pre-mRNA
splicing (Additional files 25 and 26). Interestingly,
we observed a striking enrichment in the response-to-abiotic-stress
pathways, which was commonly observed
in both treatments (Figure 4A; Additional file 27). Further
analysis using Genevestigator [24] showed that the stress-responsive
genes with abnormal splicing in NaCl-treated
sad1 mutants were closely associated with the response to
salt and ABA stresses (Figure 4B); whereas, those in sad1
under the control condition were not associated with the
response to salt and ABA stresses (Additional file 28), but
rather related to the response to various other environmental
stresses. These results not only are consistent with
the salt-sensitive phenotypes of sad1 mutants, but also
suggest that SAD1 plays critical roles in effectively regulating
splicing of stress-responsive genes under stress conditions.
Meanwhile, we found that genes with splicing
defects coincided with those regulated by transcriptional
activation under the respective treatments (shown in
Figure 4B), which suggests that the occurrence of the splicing
defects could follow or co-occur with transcriptional
activation.

Further analysis using Mapman [25] suggested that
genes with aberrant splicing in sad1 mutants are involved
in various stress response pathways, including
hormone-signaling pathways, MAPK-signaling pathways
and transcription regulation (Figure 4C; Additional file
29). Notably, some important genes (such as SnRK2.1
and 2.2, SOS2, DREB2A, NHX1, WRKY33, WRKY25,
STT3A, CAX1 and RCI2A) involved in stress responses
were identified to have splicing defects in the sad1
mutant. Among these genes, SnRK2.1 and 2.2 encode
members of SNF1-related protein kinases activated by
ionic (salt) and non-ionic (mannitol) osmotic stress that
are required for osmotic stress tolerance [26]; SOS2
encodes a protein kinase essential for salt tolerance
[27]; DREB2A encodes a transcription factor that activates
drought and salt stress-responsive genes [28];
NHX1 encodes a vacuolar sodium/proton antiporter
whose overexpression increases salt tolerance in several
plant species including Arabidopsis [29]; WRKY33
and WRKY25 encode plant WRKY transcription factors
involved in response to salt and other stresses [30,31];
STT3A encodes an oligosaccharyl transferase whose
knockout mutants are hypersensitive to high salt conditions
[32]; CAX1 encodes a high affinity vacuolar calcium
antiporter and can be activated by SOS2 to
integrate Ca2+ transport and salt tolerance [33]; and
RCI2A (Rare-cold inducible 2A) encodes a product that plays
a role in preventing over-accumulation of excess Na+
and contributes to salt tolerance [34]. These genes showed
increased intron retention in the mutants, which was
also validated by RT-PCR using intron-flanking primers
where the corresponding intron-retained transcripts
were more obviously identified in sad1, consistent with
the RNA-seq data (Figure 4D). Overall, these results
suggest that genes with aberrant splicing in sad1 are
closely related to stress response, which could directly or
indirectly contribute to the stress sensitivity of the sad1
mutant.

Overexpression of SAD1 rescues the splicing defects in
the sad1 mutant and strengthens splicing accuracy under
salt stress

To address the question of whether the splicing defects
seen in sad1 mutants result from loss of the wild-type
SAD1 protein, we overexpressed the wild-type SAD1
cDNA in the sad1 mutant, and performed RNA-seq on
the rescued plants (SAD1-OE). We first compared the
expression levels of splice junctions in SAD1-OE, C24
and sad1. We found that the AS events previously seen
in sad1 were completely or at least partially suppressed
in SAD1-OE plants (Figure 5A; Additional file 30), demonstrating
that overexpression of SAD1 was sufficient to
rescue the sad1-dependent AS defects. While our previous
study indicated that the sad1 mutation was recessive
with regard to the morphological, physiological and
stress-inducible gene expression phenotypes [14], we
could not rule out the possibility that an isoform of the
sad1 mutant protein (for example, isoform 3, Figure 1D)
might have a dominant-negative effect that could be
partly responsible for the incomplete rescue by SAD1-OE
of some of the splicing defects in sad1. Interestingly,
when comparing the number of AS events between C24
and SAD1-OE, we found that the numbers of alternative
5'SSs, alternative 3'SSs and exon-skipping events in the NaCl-treated
SAD1-OE were obviously smaller than those in
the corresponding C24 (Figure 5B), and the numbers of
corresponding junction reads were also significantly
lower (P < 0.001) (Figure 5C). These results were not observed
in the control treatment (Additional file 31).
These observations indicate that overexpression of SAD1
could inhibit AS under salt-stress conditions. Using
Fisher's exact test, we identified 454 alternative 5'SSs, alternative
3'SSs and exon-skipping events from 434
genes that were significantly absent in NaCl-treated
SAD1-OE (Additional file 32). Further analyses showed
that these alternative 5'SSs and 3'SSs are still associated
with GU or AG dinucleotides (Figure 5D) and enriched
downstream or upstream of the dominant 5'SSs and
3'SSs (Figure 5E), suggesting that overexpression of SAD1
inhibits the usage of alternative 5'SSs and 3'SSs and
promotes the usage of the dominant ones. Together with
Figure 4 Genes with abnormal splicing in sad1 are closely associated with stress response and transcriptional activation. (A) A two-dimensional
view of the relationship between the genes with abnormal splicing and their functional annotations generated by the DAVID software. The top
50 functional annotations that were ordered by the enrichment scores were selected for the two-dimensional view, which indicates that genes
with abnormal splicing were strikingly enriched (colored green) in the response-to-abiotic-stress category. (B) A heatmap was generated by
mapping the genes enriched in the response-to-abiotic-stress pathways to the microarray database using Genevestigator. The heatmap indicates that
genes with abnormal splicing in sad1 are mostly up-regulated (colored red) by ABA, cold, drought and salt stress but less regulated by biotic stress
(bacterial infection). (C) A network generated by Mapman indicates that genes with aberrant splicing in sad1 are involved in various stress response
pathways, including hormone-signaling pathways, MAPK-signaling pathways and transcription regulation. (D) Validation of the intron retention in 10
stress-responsive genes by RT-PCR using the intron-flanking primers. The grey asterisks (*) denote the intron-retained splicing variants. ABA, abscisic
acid; SA, salicylic acid; JA, jasmonic acid; sad1, sad1 mutant; WT, wild type; HSP, heat shock protein; MAPK, mitogen-activated protein kinase;
ERF, ethylene response factor; bZIP, basic-leucine zipper; WRKY, WRKY transcription factor; DOF, DNA-binding with one finger; PR-proteins,
pathogenesis-related proteins; R genes, (plant disease) resistance genes.



the result that SAD 1 -depletion activates the alternative 
5'SSs and 3'SSs, we suggest that SADl could dynamic- 
ally regulate the selection of 5'SSs and 3'SSs and control 
the splicing accuracy and efficiency. 

We further compared the expression levels of introns
in SAD1-OE with those in C24 and sad1. We found that
the expression of most introns in SAD1-OE was restored
to normal levels (Figure 5F; Additional file 33),
demonstrating that the intron retention indeed resulted from
the sad1 mutation. Furthermore, using Fisher's exact test
we identified 76 introns from 75 genes that were
significantly absent in NaCl-treated SAD1-OE, but were
over-represented in NaCl-treated C24 (Additional file 34).
This result shows that SAD1 overexpression can increase
splicing efficiency.

Overexpression of SAD1 improves plant salt tolerance

In the NaCl-treated SAD1-OE plants, we identified 506
genes with decreased alternative 5'SSs, alternative 3'SSs,
exon-skipping or intron retention. Analyses of the
expression levels of these genes demonstrated that their
functional transcripts tended to be up-regulated in SAD1-OE
plants, indicating that overexpression of SAD1 leads to
an increase in functional mRNAs (Additional file 35).
Analyses of the functional categories of these genes
revealed that they were strikingly enriched in the group
'response-to-abiotic-stimulus' (Figure 6A; Additional
file 36). More specifically, these genes were closely associated
with the response to salt and ABA stresses and with
transcriptional activation (Figure 6B). Therefore, overexpression
of SAD1 can increase splicing accuracy and efficiency of
stress-responsive genes under stress conditions. This
result further illustrates the specific role of SAD1
in the splicing of stress-related genes and the potential
relationship between transcription and splicing.

Further analysis suggested that these genes are involved
in various stress response pathways (Additional file 37).
Some of the stress-responsive genes that were more
effectively spliced in SAD1-OE included ABF3/ABF2,
encoding ABRE binding factors that mediate ABA-dependent
stress responses [35,36]; CIPK3, encoding CBL-interacting
serine/threonine-protein kinase 3 that is involved in the
resistance to abiotic stresses (for example, high salt,
hyperosmotic stress) by regulating the expression of
several stress-inducible genes [37]; and DREB2A, which
encodes a transcription factor mediating high salinity-
and dehydration-inducible transcription [28]. These genes
have been reported to be key regulators of ABA or
salt-stress responses.

With the increased splicing efficiency in these key
regulators of ABA or salt-stress responses, we were curious
to know whether the SAD1-OE plants would have
improved tolerance to salt stress. To test this, one-week-old
seedlings of C24, sad1 and SAD1-OE grown on
regular Murashige and Skoog (MS) agar medium were
transferred to MS agar plates supplemented with 0
(control), 50, 100 or 200 mM NaCl. We found that SAD1-OE
seedlings showed enhanced tolerance to 100 mM NaCl
on vertically placed plates (Figure 6C). At 200 mM NaCl,
however, root elongation of all genotypes was inhibited
and seedlings were not able to survive an extended period
of the stress treatment (data not shown). Measuring the
root growth of the seedlings showed that the roots of
SAD1-OE were longer than those of C24 and sad1 at
100 mM NaCl (Additional file 38). We also tested salt
tolerance of seedlings on horizontally placed agar medium
plates. Two-week-old seedlings from 1/2 MS media were
transferred onto 200 mM NaCl media and incubated for
five days. The percentage of green leaves among total
leaves was calculated for each seedling. The data
indicated that SAD1-OE seedlings had a higher percentage
of green leaves, suggesting that they were significantly
less damaged by the salt stress than were the wild-type
or sad1 seedlings (Figure 6D). To test further whether
SAD1-OE plants were tolerant to salt stress at the adult
stage and in soil, we grew these seedlings in soil and
irrigated them with either 50, 100, 150, 200 or 400 mM NaCl
solutions at intervals of four days (see Methods). After
two weeks of treatment, we found that sad1 plants were
very sensitive to salt stress at concentrations above
150 mM and wild-type seedlings also exhibited signs of
damage at higher salt concentrations, as indicated by
wilted inflorescences and damaged leaves, whereas the
SAD1-OE plants were not obviously affected by the
stress treatment and were also taller than the wild-type
plants (Figure 6E; Additional file 39). These results
indicate that SAD1 overexpression improves salt tolerance,
which correlates with increased splicing accuracy and
efficiency of stress-responsive genes.



Discussion 

Although studies in other eukaryotes, and more recently
in plants, have demonstrated that LSm proteins 2-8, as
the core of U6 RNPs, function in pre-mRNA splicing,
whether or not these proteins have any roles in the
regulation of splicing efficiency and selection of splice sites
has not yet been determined. In this study, through
comprehensive transcriptome analysis of mutant and
transgenic plants overexpressing the LSm5 gene SAD1,
we demonstrated that SAD1 could dynamically regulate
splicing efficiency and selection of splice sites in
Arabidopsis. We also revealed that SAD1 modulates splicing
of stress-responsive genes under salt-stress conditions.
Finally, we showed that overexpression of SAD1
significantly improved splicing efficiency of the salt-responsive
genes and resulted in enhanced salt tolerance in
transgenic plants.

We found that SAD1 depletion activated the alternative
5'SSs and 3'SSs proximal to the dominant ones,
suggesting that the wild-type SAD1 protein is necessary
for precise splice-site recognition. To our surprise,
relative to the wild-type plants, overexpression of SAD1 can
strengthen the recognition accuracy and globally inhibit
AS under salt-stress conditions. Therefore, we conclude
that SAD1 can control selection of splice sites and splicing
efficiency in a manner that depends on SAD1's abundance.
This kind of splicing regulation, which could be referred
to as a dynamic model, differs from but complements the
kinetic model of splicing regulation [1,38,39]. In the
kinetic model, the elongation rate of RNA polymerase II (Pol
II) affects splicing efficiency such that a slower Pol II
would allow more time for the recognition and processing
of weak splice sites so that splicing efficiency is
enhanced. In the dynamic model, we reasoned that the
spliceosome or other complexes involved in splicing are
under thermodynamic equilibrium between association
(complex formation) and dissociation (complex
breakdown) under any given condition. A higher dosage of
certain key small nuclear RNPs or splicing factors may
drive the reaction toward the formation of the complex
to enhance splicing efficiency. This dosage-dependent
control of splicing suggests an alternative mode of splicing
regulation, and it may be particularly important for the splicing
of a particular group of genes such as stress-inducible genes,
as discussed below.
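
As a purely illustrative sketch of the dosage argument (it is not derived from the data in this study), the equilibrium can be pictured with simple mass-action binding. The symbols are assumptions introduced only here: S is a limiting splicing factor (for example, a U6 snRNP component), P is a pre-mRNA substrate, SP is their complex, K_d is the dissociation constant and θ is the fraction of substrate engaged in the complex.

```latex
\[
  S + P \rightleftharpoons SP, \qquad
  K_d = \frac{[S]\,[P]}{[SP]}, \qquad
  \theta = \frac{[SP]}{[P] + [SP]} = \frac{[S]}{K_d + [S]}
\]
```

Under these assumptions θ rises monotonically with [S], and the gain from additional S is largest when [S] is at or below K_d, which is the regime one would expect when many newly induced transcripts compete for the same factor.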

Whereas increased alternative 5'SS usage was seen
in sad1 both under control and salt-stress conditions
(Figures 2A,B; Additional file 9A,B), the increase of
alternative 3'SSs caused by SAD1 depletion and the inhibition
of AS caused by SAD1 overexpression were only observed
under salt-stress conditions. These findings indicate that
SAD1 depletion or overexpression appears to impact
splicing under salt-stress conditions more than under normal
conditions. We considered that this distinct impact of
SAD1 on splicing under normal versus stress conditions
might have to do with increased transcriptional
activation of the stress-responsive genes. Under salt-stress or
other abiotic-stress conditions, plants activate the
expression of a large number of stress-responsive genes that
are not expressed or are expressed at lower levels under
normal non-stressful conditions [40,41]. With the simultaneous
production of a large amount of these stress-inducible
pre-mRNAs, cells would need to immediately recruit a
significant amount of splicing factors and other factors
for their co-transcriptional or post-transcriptional
processing. This imposes a huge burden on the splicing
machinery and as a result a significant portion of these
transcripts fail to be processed adequately when the
splicing machinery is compromised. This may be the reason
why most of the splicing-defective genes in sad1 are
stress-regulated (Figure 4). Conversely, a higher SAD1
dosage could play a dominant role in enhancing the
splicing efficiency of these salt-responsive genes through the
promotion of recruitment and assembly of the splicing
machinery as discussed above. As a result, the change in
the AS pattern in SAD1-OE plants was more obvious
under salt-stress conditions than under control
conditions. Thus, the expression of these (and other) highly
inducible genes may be particularly subject to the
dynamic regulation by certain splicing factors, which to
some extent is similar to the kinetic regulation of splicing,
both reflecting the saturated capability of cellular machinery.

We thought that the decreased splicing efficiency of the
stress-responsive genes might contribute to the stress
sensitivity of the sad1 mutant. The splicing defects in
sad1 lead to widespread intron retention in many
stress-responsive genes (317 genes, Additional file 40). These
genes include those encoding known key determinants
of salt tolerance such as SnRK2.1/2.2, SOS2, DREB2A,
NHX1, WRKY33, WRKY25, STT3A, CAX1 and RCI2A.
The expression levels of the functional transcripts for
many of these genes were also found to be
down-regulated (Additional file 41), although the cause of this
down-regulation is unclear. All of these intron-containing
transcripts were predicted to generate premature stop
codons and truncated proteins if translated. This
large-scale 'hidden' change in pre-mRNA splicing efficiency
or gene expression, although relatively small for some
of the individual genes, may collectively undermine
the plants' readiness for the stress. However, it should be
pointed out that a direct relationship between the splicing
defects and stress sensitivity in the sad1 mutant could
not be established at this point.

Interestingly, an increase of splicing efficiency and
expression of stress-responsive genes correlated with
improved stress tolerance of the plants. Indeed, transgenic
plants overexpressing SAD1 exhibited clearly increased
tolerance to salt stress (Figure 6E), although the
magnitude of the increase was moderate. Nonetheless, this
finding is very significant for two reasons. First, it indicates
that splicing efficiency may play an important role in
regulating plant stress resistance. This is consistent with
findings in several other genetic studies, where certain RNA
processing factors were also found to be required for
plant stress resistance. These factors include, for example,
ABH1 [42], LOS4 [43] and RCF1 [44], although the
mechanisms involved were unclear. Secondly, our
finding provides a new approach to improving plant stress
resistance, namely, by regulating the splicing efficiency.
Current methods to increase plant salt tolerance mostly
involve the overexpression of structural genes such as
ion transporter genes [29,45,46]. Constitutively expressing
these structural genes may cause unwanted side effects
that would result in reduced yield under normal growth
conditions. However, enhancing splicing efficiency does
not affect gene expression under normal conditions, as
demonstrated in this study. Our finding may also be
applicable to enhancing stress tolerance or other traits
in other eukaryotic systems.

Conclusions 

We demonstrated that SAD1 dynamically regulates
splicing efficiency and plays a regulatory role in the selection
of splice sites. Furthermore, we found that SAD1
specifically modulates splicing of the stress-responsive genes
under stress conditions. Finally, we showed that
overexpression of SAD1 improves salt tolerance of transgenic
plants, which correlates with the increased splicing
efficiency of the salt-stress-responsive genes. Our study
provided novel insights into the regulatory role of SAD1
or LSm proteins in splicing and also suggested new
strategies for improving splicing efficiency and
bioengineering stress-resistant plants.

Materials and methods 

Plant materials and growth conditions 

The Arabidopsis sad1 mutant in the C24 background was
described previously [14]. For overexpressing the SAD1
gene, the SAD1 cDNA, amplified from the wild-type plant,
was cloned into pENTRY1A. The LR reaction was then
performed between pGWB502 and the pENTRY1A-SAD1.
The resulting plasmid (pGWB502-SAD1) was introduced
into Agrobacterium tumefaciens GV3101 and transformed
into sad1 mutant plants using the floral dipping method.
The transformants were selected on 1/2 MS medium
supplemented with 25 μg/ml hygromycin. Positive
transformants were further confirmed by genotyping using
the primers 5'-CACCGGATCCTGATGGCGAACAATCCTTCACAGC-3'
and 5'-TAATGAATTCGATCATTCTCCATCTTCGGGAGACC-3'
for SAD1 cDNA. The confirmed
transgenic seedlings (referred to as SAD1-OE)
were used for RNA sequencing and RT-PCR analyses.

Seeds of C24, sad1 and SAD1-OE plants were sterilized
with 50% bleach and 0.01% Triton X-100. The sterilized
seeds were sown on 1/2 MS plates supplemented with 3%
sucrose. After four-day stratification at 4°C, the plates
were placed under a 16 h-light and 8 h-dark cycle at 21°C
for germination and seedling growth. Twelve days later,
the seedlings were treated with H2O (control) or 300 mM
NaCl for 3 h, and harvested for total RNA extraction.

RNA extraction, library construction and sequencing 

Using the TRIzol Reagent (15596-026, Invitrogen Co.,
Carlsbad, CA, USA), total RNAs were extracted from
12-day-old seedlings of wild-type C24, sad1 and SAD1-OE.
Polyadenylated RNAs were isolated using the Oligotex
mRNA Midi Kit (70042, Qiagen Inc., Valencia, CA, USA).
The RNA-seq libraries were constructed using the Illumina
Whole Transcriptome Analysis Kit following the standard
protocol (Illumina, HiSeq system) and sequenced on the
HiSeq platform to generate high-quality single-end reads
of 101 nucleotides (some with 85 nucleotides due to
machine failure) in length.

RNA-sequencing data analysis pipeline 

To analyze RNA-seq data, a pipeline was developed, which 
involved five steps: read alignment and junction pre- 
diction, the filter of false positive junctions, annota- 
tion of AS events, global comparison of AS and the 
identification of differential AS events (for details, see 
Additional file 42). 

Read alignment and junction prediction 

TopHat [21] was used to align the reads against the
Arabidopsis genome sequences and annotated gene models
downloaded from TAIR10 [18], allowing two nucleotide
mismatches. TopHat was also used to
predict the splice junctions, without permitting any
mismatches in the anchor region of a spliced alignment.
The splice junctions were classified into known and
novel splice junctions using a Perl script, which takes
as input genome coordinates of all annotated exons and
all predicted splice junctions. In addition, the expression
levels of transcripts were measured as reads per kilobase
per million (RPKM) values using the Cufflinks software [47].
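
For readers unfamiliar with the unit, the following minimal sketch shows the standard definition of reads per kilobase per million mapped reads. It is not the Cufflinks implementation, which estimates transcript abundances with its own statistical model; the function name and the example numbers are hypothetical.

```python
def rpkm(read_count: int, transcript_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of transcript per million mapped reads.

    RPKM = reads * 10^9 / (transcript length in bp * total mapped reads).
    """
    if transcript_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and library size must be positive")
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


# Hypothetical example: 500 reads on a 2,000-bp transcript
# in a library of 10 million mapped reads.
example_value = rpkm(500, 2000, 10_000_000)
print(example_value)  # 25.0
```

Normalizing by both length and library size is what allows intron and exon signals of different lengths, from libraries of different depths, to be compared on one scale.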

The filter of false positive junctions 

To estimate thresholds for filtering false positive
junctions, two datasets of random and annotated splice
junctions were first created. The dataset of 80,000 random
splice junctions was created by joining each annotated
5' donor sequence (90/74 bp from 5'SSs) and an
annotated 3' acceptor sequence (90/74 bp from 3'SSs) located
on a different chromosome (Additional file 42). The
119,618 annotated splice junctions were created by
joining each annotated 5' donor sequence (90/74 bp from
5'SSs) and the annotated 3' acceptor sequence (90/74 bp
from 3'SSs) in order based on the gene annotation
(Additional file 42). All splice junctions contained 90/74
nucleotides of exon sequence on either side of the
junction to force an alignment overhang of at least 11
nucleotides from one side of the splice junction to the other.
Then, the mapping software BWA [48] was used to align
all reads to the random and annotated splice junctions,
without permitting any mismatches. The alignments to
random junctions were considered to be false positives,
because such junctions are thought to rarely exist when
compared to annotated junctions. We further
characterized the false positive junctions, which generally have an
overhang size of fewer than 20 bp and lower read
coverage (Additional file 7A-B). To minimize the false
positive rate, we required an overhang size of more than
20 bp and at least two reads spanning a junction as
cutoff values to filter out the false positive junctions.
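
The two cutoffs can be expressed as a simple predicate. The sketch below is an illustration only: the `Junction` record, its field names and the example values are hypothetical, and it assumes that the overhang of a junction is summarized by a single number (for instance, the longest overhang among its supporting reads), which the text does not specify. It also reproduces the arithmetic behind the forced minimum overhang of 11 nucleotides (101 - 90 and 85 - 74).

```python
from dataclasses import dataclass

MIN_OVERHANG_BP = 20    # the overhang must be longer than this
MIN_SPANNING_READS = 2  # at least this many reads must span the junction


@dataclass(frozen=True)
class Junction:
    junction_id: str     # hypothetical identifier
    overhang_bp: int     # assumed: longest overhang among supporting reads
    spanning_reads: int  # number of reads spanning the junction


def forced_min_overhang(read_length: int, flank_length: int) -> int:
    """Shortest overhang of a read aligned fully inside a synthetic junction
    sequence that has `flank_length` bp of exon on each side."""
    return read_length - flank_length


def passes_filter(j: Junction) -> bool:
    """Apply both cutoffs: overhang > 20 bp and at least two spanning reads."""
    return j.overhang_bp > MIN_OVERHANG_BP and j.spanning_reads >= MIN_SPANNING_READS


junctions = [
    Junction("j1", overhang_bp=35, spanning_reads=4),  # passes both cutoffs
    Junction("j2", overhang_bp=12, spanning_reads=9),  # overhang too short
    Junction("j3", overhang_bp=40, spanning_reads=1),  # only one read
]
kept = [j.junction_id for j in junctions if passes_filter(j)]
print(forced_min_overhang(101, 90), forced_min_overhang(85, 74))  # 11 11
print(kept)  # ['j1']
```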

Annotation of alternative splicing events 

JuncBASE [49], which takes as input genome coordinates of
all annotated exons and all confidently identified splice
junctions, was used for annotating all AS events,
including cassette exons, alternative 5'SSs, alternative
3'SSs, mutually exclusive exons, coordinate cassette exons,
alternative first exons, alternative last exons and intron
retention. Notably, for identifying the events of intron
retention, we required that at least five reads covered at
least 50% of the region of one intron.
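
The intron-retention criterion can be read in more than one way. The sketch below implements one plausible interpretation, stated here as an assumption: an intron is called retained when at least half of its positions have a read depth of five or more. The function name, its parameters and the example depth vectors are hypothetical and are not taken from the pipeline.

```python
from typing import Sequence


def is_retained(per_base_depth: Sequence[int],
                min_depth: int = 5,
                min_fraction: float = 0.5) -> bool:
    """Return True if the fraction of intron positions covered by at least
    `min_depth` reads reaches `min_fraction` (assumed interpretation)."""
    if not per_base_depth:
        return False
    covered = sum(1 for depth in per_base_depth if depth >= min_depth)
    return covered / len(per_base_depth) >= min_fraction


# Hypothetical 10-bp introns with per-base read depths.
print(is_retained([5, 6, 7, 5, 5, 8, 0, 0, 0, 0]))  # True: 6 of 10 positions qualify
print(is_retained([5, 6, 7, 5, 0, 0, 0, 0, 0, 0]))  # False: only 4 of 10 qualify
```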

Global comparison of alternative splicing 

The global comparison of AS among WT, sad1 and
SAD1-OE started with randomly resampling an equal
number of uniquely mapped reads from each sample to make
sure that the comparison was at the same level. The
comparison covered two aspects: the absolute number of each type
of AS event and the number of junction reads that were
assigned to each type of AS event, because both of them
can be used to measure the global changes of AS.
Fisher's exact tests in R [50] were used to identify
differential representation of each type of AS event,
performed on the number of junction reads that were assigned
to each type of AS event.
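
The resampling step can be sketched as follows. This is a minimal illustration under stated assumptions: reads are represented by hypothetical string identifiers held in memory, the common depth is taken to be the size of the smallest library, and sampling is without replacement. None of these details are specified in the text.

```python
import random
from typing import Dict, List


def downsample_to_common_depth(samples: Dict[str, List[str]],
                               seed: int = 0) -> Dict[str, List[str]]:
    """Randomly draw the same number of reads from every sample, using the
    smallest library size as the common depth (assumed choice)."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    depth = min(len(reads) for reads in samples.values())
    return {name: rng.sample(reads, depth) for name, reads in samples.items()}


# Hypothetical uniquely mapped read identifiers for three libraries.
libraries = {
    "C24": [f"c24_read{i}" for i in range(1000)],
    "sad1": [f"sad1_read{i}" for i in range(800)],
    "SAD1-OE": [f"oe_read{i}" for i in range(1200)],
}
resampled = downsample_to_common_depth(libraries)
print({name: len(reads) for name, reads in resampled.items()})
```

After this step every library contributes the same number of reads, so counts of AS events and junction reads can be compared directly.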

The identification of differential alternative splicing events 

Fisher's exact tests were also used to identify differential
representation of each AS event. For alternative 5'SSs
and 3'SSs and exon-skipping events, Fisher's exact tests
were performed on the comparison of the junction-read
counts and the corresponding exon-read counts between
C24 and sad1 or SAD1-OE. The events with P <0.01 were
identified as significantly different events. In addition, for
those AS events that were uniquely identified in C24,
sad1 or SAD1-OE, we considered them significant
if they were supported by at least five junction reads, and
the P value of these events was assigned a value of zero.
Similarly, for intron retention, Fisher's exact tests were
performed on the intron-read counts and the corresponding
exon-read counts between C24 and sad1 or SAD1-OE.
The events with P <0.001 were identified as significantly
differential events. In addition, for those intron retention
events uniquely identified in C24 or the mutant, we
considered them significant if they were supported by at least
five-fold coverage, and the P value of these events was assigned
a value of zero.
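
The per-event test can be illustrated with a 2 x 2 table of junction-read and exon-read counts in two genotypes. The study used Fisher's exact test in R; the sketch below is instead a small pure-Python implementation of the usual two-sided test (summing the probabilities of all tables no more likely than the observed one), written only for illustration. The function name, the layout of the table and the read counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2 x 2 table [[a, b], [c, d]].

    Rows are genotypes; columns are junction reads and exon reads
    (assumed layout for this illustration).
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_observed = prob(a)
    # Sum all tables that are at most as probable as the observed one;
    # the small tolerance guards against floating-point ties.
    p_value = sum(p for p in map(prob, range(lowest, highest + 1))
                  if p <= p_observed * (1 + 1e-7))
    return min(1.0, p_value)


# Hypothetical counts: (junction reads, exon reads) in C24 and in SAD1-OE.
p = fisher_exact_two_sided(30, 200, 5, 210)
print(p < 0.01)  # True: this event would pass the P < 0.01 cutoff
```

Exact binomial coefficients keep the sketch simple and dependency-free, at the cost of speed for very large read counts.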

RT-PCR validation 

The selected AS and intron retention events were validated
by RT-PCR using a set of primers (Additional file 43)
that were designed based on each AS event. Total RNAs
from the C24, sad1 and SAD1-OE plants were extracted
using Trizol solution (Invitrogen; cat. 10837-08), treated
with DNase I, and reverse-transcribed to cDNA (random
priming) by using a standard protocol (SuperScript
II reverse transcriptase, Invitrogen).

Quantitative RT-PCR 

For the RT reaction, we used 3 μg total RNAs from the
control (H2O) and 300 mM NaCl-treated C24, sad1 and
SAD1-OE seedlings. The RT reactions were done with the
Invitrogen SuperScript® III First-Strand Synthesis SuperMix
in a 20 μl reaction system; random hexamers were
used for first-strand synthesis. The RT solution was
diluted 10 times, and 1 μl of the solution was used as
template in a 10 μl reaction system with 2 x SYBR Green
(Invitrogen) Supermix (ROX). The quantitative RT-PCRs
were performed in triplicate using the ABI 7900HT
Fast Real-Time PCR System (Applied Biosystems Inc.,
Foster City, CA, USA). The primers
5'-AAGGAGATAAGGAGGTGGTTGG-3' and
5'-ATGTGATGAAGGTTTGTGAGG-3'
were used for detecting expression levels of SAD1.

Salt-stress tolerance assays 

Surface-sterilized seeds of C24, sad1 and SAD1-OE were
sown onto agar plates with MS and 1.2% agar. The
plates were then kept at 4°C for four days before being
incubated at 21°C for germination. Four days after
germination, the seedlings were transferred to MS agar
plates supplemented with 0, 50, 100 or 200 mM NaCl,
respectively. The seedlings were then allowed to grow for
four days, and seedlings were photographed. Root length
of these seedlings was measured by using ImageJ [51].
For measuring leaf damage, two-week-old seedlings grown
on MS plates (0.6% agar) were transferred onto 1/2 MS
medium plates supplemented with 200 mM NaCl and
incubated for five days. The number of green leaves and
yellowish or bleached leaves was counted for each
seedling and the percentage of green leaves among total leaves
was calculated (leaves of the no-salt control treatment
were all green and were not counted). To further test
the tolerance to salt stress, seedlings grown on 1/2 MS
agar plates were transferred to soil. One week after the
transfer, the seedlings were irrigated with 50, 100 or
150 mM NaCl (in 1/8 MS salt), respectively [29]. At 24 days
after the transfer, the plants were further irrigated with
400 mM NaCl (100 ml) and pictures were taken four
days later.

Data availability 

The RNA-seq data generated in this work have been
submitted to the Sequence Read Archive database at NCBI.
The accession number is SRP026082.

Additional files 



Additional file 1: Mapping results of RNA-seq reads. 

Additional file 2: Distribution of the RNA-seq reads along annotated 
Arabidopsis genomic features. Among reads that unambiguously 
match the Arabidopsis genome, more than 90% of reads match to 
annotated exons. 

Additional file 3: Distribution of the RNA-seq read coverage was plotted
along the length of the transcriptional unit. The x-axis indicates the relative
length of transcripts, and the y-axis shows the median depth of coverage.

Additional file 4: Saturation curve for gene detection. Randomly
sampled reads were plotted against the expressed genes. The x-axis shows
the number of the mapped reads and the y-axis displays the number of the
expressed genes.

Additional file 5: Transcription profiles were plotted across the
Arabidopsis genome. Distribution of RNA-seq read density along
chromosome length is shown. Each vertical blue bar represents log2 of
the frequency of reads plotted against chromosome coordinates. A schematic
drawing of the chromosome and its features is shown below the read density.
Approximate boundaries of centromeres are depicted in gray.

Additional file 6: The distinctive features between known and novel 
splice junctions. (A) After comparing all the splice junctions to the gene 
annotation, about 83% of total junctions belong to the annotated junctions, 
and the remaining 17% were assigned to novel junctions. (B) The density of 
overhang size with exon for known and novel splice junctions in each 
sample. X-axis indicates the size of overhang on exon and y-axis indicates 
the density of the sizes. (C) The density of junction read coverage for 
known and novel junctions. 

Additional file 7: The features of false positive (random) and 
annotated junctions. (A) The density of the overhang size of false 
positive and annotated junctions. Most of false positive junctions show 
shorter overhang sizes, while the annotated junctions have larger 
overhang sizes. (B) The density of junction read coverage of false 
positives and annotated junctions. More than half of false positive 
junctions have only one read spanning the junction, while the annotated 
junctions have higher reads coverage. (C) Distinguishing true junctions 
from false positive alignments. To reduce the number of false positive 
junctions, as determined by randomly generated junctions, we required 
that the overhang size must be more than 20 bp (>20 bp) and at least 
two reads (>1 read) span the junctions. Using both criteria, the false 
positive junctions sharply reduced to very low levels (close to zero). By 
contrast, the annotated junctions show no obvious decrease. 

Additional file 8: Annotation of AS events based on all the 
confident junctions. The AS events include cassette exons, alternative 
5'SSs, alternative 3'SSs, mutually exclusive exons, coordinate cassette 
exons, alternative first exons and alternative last exons. 

Additional file 9: Comparison of global AS between the wild type
and sad1 under the control conditions. (A) The counts of each type of
AS events in the wild type and sad1. The green/blue bars represent
forward and reverse sequencing reads. (B) The total counts of the splice
junction reads from each type of AS in the wild type and sad1. The P values
were calculated by Fisher's exact test comparing the junction read counts
and the uniquely mapped reads between the wild type and sad1.

Additional file 10: List of alternative 5' splice sites that were
over-represented in sad1 under the control or NaCl treatment.

Additional file 11: List of exon-skipping events that were
over-represented in sad1 under the control or NaCl treatment.

Additional file 12: List of alternative 5' splice sites that were
over-represented in the wild type under the control or NaCl treatment.

Additional file 13: List of exon-skipping events that were
over-represented in the wild type under the control or NaCl treatment.

Additional file 14: List of alternative 3' splice sites that were
over-represented in the NaCl-treated sad1 mutant.

Additional file 15: List of alternative 3' splice sites that were
over-represented in the NaCl-treated wild type.

Additional file 16: Validation of the AS events in 19 genes by
RT-PCR and corresponding IGV visualization. These 19 events include
9 alternative 5'SSs, 4 alternative 3'SSs and 6 exon-skipping events. For the
validation of alternative 5'SSs and 3'SSs, there was only one band that
represents the alternative-splice isoform, which was clearly detected in
sad1 mutants, but not or only weakly detected in the wild type and
SAD1-OE. For exon-skipping events, the alternatively spliced forms were
marked with grey asterisks (*). For IGV visualization, alternative splicing
sites were marked by red arrows and highlighted by red arcs.

Additional file 17: The characteristics of activated alternative 5'SSs in
sad1 under the control conditions. (A) The sequences around the
alternative 5'SSs that were over-represented in the mutant were shown
by Weblogo. (B) Distribution of activated alternative 5'SSs around the
dominant ones. These alternative 5'SSs were enriched in the downstream
or upstream 10 bp region of the dominant 5'SSs (position 0 on the x-axis).

Additional file 18: Comparison of intron retention between the
wild-type and sad1 plants under the control conditions. The RPKM
values for the exons and introns were plotted. The expression of introns,
but not exons, in the sad1 mutant showed a global up-regulation.

Additional file 19: Validation of the intron retention in selected
genes by RT-PCR using the intron-flanking primers. The intron-retained
splicing variants were marked by grey asterisks (*).

Additional file 20: List of genes with intron retention in sad1 under
the control or NaCl treatment.

Additional file 21: List of genes with intron retention in the wild
type under the control or NaCl treatment.

Additional file 22: Distribution of the proportions of intron-retained
transcripts to the total transcripts for each gene with intron retention
in the sad1 mutants. The percentage is calculated by dividing the RPKM
value of the retained intron by the RPKM value of the two flanking exons.

Additional file 23: Comparison of the total transcripts and
functional transcripts (without introns) between the wild type and
sad1. The relative expression of total transcripts was measured as the
read number of the two exons flanking the retained intron, and the
relative expression of functional transcripts was calculated by deducting
the expression of the retained intron (measured by the read number of
the retained intron) from the expression of the total transcripts. The
expression levels of the total transcripts did not show obvious change
between the wild type and sad1, but the functional transcripts tended
to be down-regulated in the control and NaCl-treated sad1 mutants.

Additional file 24: Expression level of functional transcripts in
intron-retained genes.

Additional file 25: Functional category of genes with abnormal
splicing in the NaCl-treated sad1 mutant.

Additional file 26: Functional category of genes with abnormal
splicing in the sad1 mutant under the control conditions.

Additional file 27: A two-dimension view of the functional annotations 
of genes with abnormal splicing in sadl under the control conditions. 

The functional classification of genes was done by using the DAVID software. 
The top 50 functional annotations ordered by the enrichment scores were 
selected for the two-dimensional view, which indicates that genes with 
abnormal splicing were strikingly enriched in the response-to-abiotic- 
stress category. 

Additional file 28: A heatmap was generated by mapping the
genes enriched in the response-to-abiotic-stress pathways to the
microarray database at Genevestigator. The heatmap indicates that
genes with abnormal splicing in the sad1 control treatment are not
specifically related to the response to salt and ABA stress, but rather
associated with random responses to various environmental stresses.

Additional file 29: A network generated by Mapman indicates that
genes with aberrant splicing in the sad1 control treatment are
involved in various stress response pathways, including
hormone-signaling pathways, MAPK-signaling pathways and transcription
regulation.

Additional file 30: Profiling the normalized (by total uniquely
mapped reads) read coverage of the splice junctions that were
over-represented in the sad1 mutant relative to the wild type and
SAD1-OE under the control conditions. The profiles indicate that the
AS patterns in sad1 were completely or largely restored by
overexpressing SAD1.

Additional file 31: Number of each type of AS event in the wild
type and SAD1-OE under the control conditions. Note that the
numbers of alternative 5'SSs, alternative 3'SSs and exon-skipping events
in SAD1-OE are close to those in the wild type. The green/blue bars
represent forward and reverse sequencing reads.

Additional file 32: List of AS events that were significantly absent
in the NaCl-treated SAD1-OE plants.

Additional file 33: Profiling the normalized (by total uniquely mapped
reads) read coverage of the introns that were over-represented in the
sad1 mutant relative to the wild type and SAD1-OE under the control
conditions. The profiles indicate that the intron retention events in sad1 were
completely or largely restored by overexpressing SAD1.

Additional file 34: List of introns that were significantly absent in
the NaCl-treated SAD1-OE compared to the wild type.

Additional file 35: Comparison of the total transcripts and functional
transcripts (without retained intron) between the NaCl-treated
wild-type and SAD1-OE plants. The read numbers (log10) for the total
transcripts and functional transcripts were plotted for the wild type
and SAD1-OE. The functional transcripts tended to be up-regulated in the
NaCl-treated SAD1-OE plants.

Additional file 36: Functional category of genes with increased
splicing efficiency in the NaCl-treated SAD1-OE plants.

Additional file 37: A network generated by Mapman indicates that
genes with increased splicing efficiency in SAD1-OE are involved in
various stress response pathways, including hormone-signaling
pathways, MAPK-signaling pathways and transcription regulation.

Additional file 38: Relative root length of the wild-type, sad1 and
SAD1-OE seedlings after four days of growth on 0, 50 or 100 mM NaCl.
Data are means and standard errors from about 15 seedlings. One-week-old
seedlings grown on ½ MS medium plates were transferred to ½ MS
medium plates supplemented with the indicated concentrations of NaCl
and allowed to grow for four days before the root length was measured.

Additional file 39: Morphology of 28-day-old wild-type, sad1 and
transgenic (SAD1-OE) plants grown under normal conditions
(without NaCl treatments).

Additional file 40: Functional category of stress-responsive genes
with intron retention in the sad1 mutant.

Additional file 41: List of stress-related intron-retained genes
whose functional transcripts are down-regulated.

Additional file 42: The pipeline of RNA-seq data analysis in this
study. The pipeline involves five steps: read alignment and junction
prediction, filtering of false-positive junctions, annotation of AS events,
global comparison of AS and identification of differential AS events.
To estimate thresholds for filtering false-positive junctions, two datasets
of random and annotated splice junctions were created. The random
splice junctions dataset was created by joining each annotated 5' donor
sequence (90/74 bp from 5'SSs) and an annotated 3' acceptor sequence
(90/74 bp from 3'SSs) located on a different chromosome. The annotated
splice junctions dataset was created by joining each annotated 5' donor
sequence (90/74 bp from 5'SSs) and the annotated 3' acceptor sequence
(90/74 bp from 3'SSs) in order based on the gene annotation.
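
The construction of the two junction datasets can be sketched as below. This is only an illustration under stated assumptions, not the pipeline's code: the `Intron` record, the function names and the random choice of partner are hypothetical, and the flanking sequences are assumed to have been extracted from the annotation beforehand.

```python
import random
from typing import List, NamedTuple


class Intron(NamedTuple):
    chrom: str
    donor_flank: str     # exonic sequence ending at the 5'SS
    acceptor_flank: str  # exonic sequence starting at the 3'SS


def annotated_junctions(introns: List[Intron]) -> List[str]:
    """Join each donor flank with the acceptor flank of the same annotated intron."""
    return [i.donor_flank + i.acceptor_flank for i in introns]


def random_junctions(introns: List[Intron], seed: int = 0) -> List[str]:
    """Join each donor flank with an acceptor flank from a different chromosome.

    Such pairs cannot arise from cis-splicing, so reads aligning to them
    serve as a background of false-positive junctions.
    """
    rng = random.Random(seed)
    junctions = []
    for intron in introns:
        others = [j for j in introns if j.chrom != intron.chrom]
        if others:  # skip if no other chromosome is available
            junctions.append(intron.donor_flank + rng.choice(others).acceptor_flank)
    return junctions
```

Comparing how reads align to the random set versus the annotated set is one way to pick filtering thresholds that remove most random junctions while keeping most annotated ones.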

Additional file 43: The primers used for RT-PCR to validate 22 AS 
events and 20 intron retention events. 